ABSTRACT

This paper concerns situations in which a p x p covariance matrix is
a function of an unknown q x 1 parameter vector $\gamma_o$. Notation
is defined in the second section, and some algebraic results used in
subsequent sections are given. Section 3 deals with asymptotic
properties of generalized least squares (G.L.S.) estimators of
$\gamma_o$. Section 4 concerns methods for obtaining estimates of
parameters in certain linear covariance structures.

Generalized Least Squares Estimators in the
Analysis of Covariance Structures

Summary 

Let S represent the usual unbiased estimator of a covariance matrix,
$\Sigma_o$, whose elements are functions of a parameter vector
$\gamma_o$: $\Sigma_o = \Sigma(\gamma_o)$. A generalized least squares
(G.L.S.) estimate, $\hat\gamma$, of $\gamma_o$ may be obtained by
minimizing $\mathrm{tr}[\{S - \Sigma(\gamma)\}V]^2$ where V is some
positive definite matrix. Asymptotic properties of the G.L.S.
estimators are investigated assuming only that $\Sigma(\gamma)$
satisfies certain regularity conditions and that the limiting
distribution of S is multivariate normal with specified parameters.
The estimator of $\gamma_o$ which is obtained by maximizing the Wishart
likelihood function (M.W.L. estimator) is shown to be a member of the
class of G.L.S. estimators with minimum asymptotic variances. When
$\Sigma(\gamma)$ is linear in $\gamma$, a G.L.S. estimator which
converges stochastically to the M.W.L. estimator involves far less
computation. Methods for calculating estimates of $\gamma_o$, estimates
of the dispersion matrix of $\hat\gamma$, and test statistics are given
for certain linear models.

Some key words: Covariance structures; Generalized least squares;
Asymptotic distributions.




1. Introduction 

This paper will be concerned with situations where a p x p covariance
matrix, $\Sigma_o$, is a function of an unknown q x 1 parameter vector
$\gamma_o$:

    $\Sigma_o = \Sigma(\gamma_o)$ .                                (1)

Suppose that the p component vectors $x_k$, k = 1, 2, ..., n + 1, are
independently and identically distributed with mean vector $\mu_o$ and
covariance matrix $\Sigma_o$. Let S represent the usual unbiased
estimator of $\Sigma_o$ obtained from the $x_k$. It has been common
practice to assume a multivariate normal distribution for $x_k$ or a
Wishart distribution for S, and employ maximum likelihood estimators of
$\gamma_o$. Nonlinear structures (e.g., Jöreskog, 1970a) and linear
structures (e.g., Bock & Bargmann, 1966; Anderson, 1969, 1970) have
been investigated. Provided that $\mu_o$ is unstructured, maximum
likelihood estimators of $\gamma_o$ based on a multivariate normal
distribution for $x_1, \ldots, x_{n+1}$, or on a Wishart distribution
for S, are functions of S only and differ only by a scaling factor,
n/(n + 1). The choice of maximum likelihood estimators is possibly due
to their asymptotic efficiency and associated likelihood ratio test.
Considering a particular nonlinear covariance structure, the
unrestricted factor analysis model, Jöreskog & Goldberger (1972) have
shown that a certain generalized least squares estimator also is
asymptotically efficient and that a corresponding weighted residual sum
of squares statistic converges stochastically to the likelihood ratio
statistic.

This paper considers estimators of $\gamma_o$ which are functions of S
where

    $E(s_{ij}) = \sigma_{oij}$ .                                   (2)

The only assumptions about the distribution of elements of S concern
the asymptotic distribution as $n \to \infty$, which is to be the
multivariate normal distribution with means given by (2) and covariances

    $\mathrm{Cov}(s_{ij}, s_{gh}) = n^{-1}(\sigma_{oig}\sigma_{ojh} + \sigma_{oih}\sigma_{ojg})$ .   (3)

This requires only that all fourth order cumulants of the distribution
of the $x_k$ are zero (cf. Cramér, 1946, pp. 365-366; Kendall & Stuart,
1969, p. 521). The results to be given apply to, but are not confined
to, the situation where the $x_k$ have a multivariate normal
distribution and S has a Wishart distribution.

Section 3 will be concerned with asymptotic properties of generalized
least squares (G.L.S.) estimators of $\gamma_o$. No specific form will
be assumed for the covariance structure model. Results will apply to all
models which satisfy certain regularity conditions. Although S may not
necessarily have a Wishart distribution one may still obtain estimates
by maximizing the Wishart likelihood function. These "M.W.L." estimators
will be shown to have the asymptotic properties of the class of G.L.S.
estimators with minimum asymptotic variances.

When the covariance structure is linear, G.L.S. estimates may be
expressed in closed form and are more easily calculated than the M.W.L.
estimates. Section 4 will be concerned with methods for obtaining
estimates of parameters in certain linear covariance structures.

The next section defines notation and gives some algebraic results
which will be used in subsequent sections.



2. Notation and Preliminary Algebraic Results 



The column vector formed from elements of a p x p matrix, S, taken
columnwise will be denoted by Vec(S) or by the corresponding small letter
underlined, i.e.,

    $\mathrm{Vec}'(S) = s' = (s_{11}, s_{21}, \ldots, s_{p1}, s_{12}, s_{22}, \ldots, s_{p2}, \ldots, s_{1p}, \ldots, s_{pp})$ .

Double subscripts, ij, are used to denote elements of this vector, the
first subscript always being nested within the second. Double subscripts
will also be used to represent rows or columns of certain matrices. For
example, a typical element of the direct product $A \otimes B$ will be
denoted by $[A \otimes B]_{ij,gh}$, where

    $[A \otimes B]_{ij,gh} = [A]_{jh}[B]_{ig}$ .                   (4)

Using this expression it is easily shown that

    $(A \otimes B)s = \mathrm{Vec}(BSA')$                          (5)

if A and B are of order m x p and S is of order p x p.

The column vector formed from the elements above and including the
diagonal of a symmetric matrix, S, taken columnwise, will be denoted by
$\bar{s}$, i.e.,

    $\bar{s}' = (s_{11}, s_{12}, s_{22}, s_{13}, s_{23}, s_{33}, \ldots, s_{pp})$ .

Again, double subscripts, ij, are used to denote elements of this vector,
the first being nested within the second and not exceeding the second.

As the p x p matrix S is symmetric, the p(p + 1)/2 x 1 vector $\bar{s}$
may be expressed in terms of the $p^2$ x 1 vector s:

    $\bar{s} = K_p' s$                                             (6)

where $K_p$ is of order $p^2$ x p(p + 1)/2 with typical element

    $[K_p]_{ij,gh} = 2^{-1}(\delta_{ig}\delta_{jh} + \delta_{ih}\delta_{jg})$ ,   $i \le p$, $j \le p$; $g \le h \le p$

and $\delta_{ij}$ represents Kronecker's delta. Therefore,

    $[K_p]_{ii,ii} = 1$
    $[K_p]_{ij,ij} = [K_p]_{ji,ij} = 1/2$ ,   $i \ne j$
    $[K_p]_{ij,gh} = 0$   if $ij \ne gh$ and $ij \ne hg$ .

A left inverse of $K_p$ is

    $K_p^- = (K_p' K_p)^{-1} K_p'$                                 (7)

which is of order p(p + 1)/2 x $p^2$ with typical element

    $[K_p^-]_{gh,ij} = (2 - \delta_{gh})[K_p]_{ij,gh}$ ,   $i \le p$, $j \le p$; $g \le h \le p$
                     = 1 if ij = gh or ij = hg
                     = 0 otherwise.

This matrix may be used to express s in terms of $\bar{s}$:

    $s = K_p^{-\prime} \bar{s}$ .                                  (8)

Let $M_p$ represent the $p^2$ x $p^2$ symmetric idempotent matrix

    $M_p = K_p (K_p' K_p)^{-1} K_p' = K_p K_p^-$                   (9)

with typical element

    $[M_p]_{ij,gh} = 2^{-1}(\delta_{ig}\delta_{jh} + \delta_{ih}\delta_{jg})$ ,   $i \le p$, $j \le p$, $g \le p$, $h \le p$ .

This matrix has an interesting property. If A is of order p x m, then

    $M_p (A \otimes A) = (A \otimes A) M_m$ .                      (10)

Other properties are:

    $M_p K_p = K_p$                                                (11)

and

    $M_p s = s$ .                                                  (12)

The inverse of the matrix $K_p'(W \otimes W)K_p$, where W is nonsingular
of order p x p, is

    $\{K_p'(W \otimes W)K_p\}^{-1} = K_p^-(W^{-1} \otimes W^{-1})K_p^{-\prime}$ .   (13)

This result may be verified by multiplication using (9), (10), and (11)
and the inversion rule of direct products (e.g., Searle, 1966, p. 216):

    $K_p'(W \otimes W) M_p (W^{-1} \otimes W^{-1}) K_p^{-\prime} = K_p' M_p (W \otimes W)(W \otimes W)^{-1} K_p^{-\prime} = I$ .
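The identities of this section can be checked numerically. The following is a minimal NumPy sketch, not part of the original paper: the helper names `vec`, `vech_upper`, `K_matrix` and `K_minus` are hypothetical, and the test matrices are arbitrary random values. It builds $K_p$ from its typical element and checks (5), (6), (8), (10) and (13).

```python
import numpy as np

def vec(S):
    """Stack the columns of S, so the first subscript is nested in the second."""
    return S.reshape(-1, order="F")

def vech_upper(S):
    """Elements on and above the diagonal, columnwise: s11, s12, s22, s13, ..."""
    p = S.shape[0]
    return np.array([S[g, h] for h in range(p) for g in range(h + 1)])

def K_matrix(p):
    """K_p (p^2 x p(p+1)/2) with elements (d_ig d_jh + d_ih d_jg) / 2."""
    pairs = [(g, h) for h in range(p) for g in range(h + 1)]
    K = np.zeros((p * p, len(pairs)))
    for col, (g, h) in enumerate(pairs):
        K[h * p + g, col] += 0.5  # row ij = gh
        K[g * p + h, col] += 0.5  # row ij = hg (the same row when g == h)
    return K

def K_minus(p):
    """Left inverse (K'K)^{-1} K' of K_p, as in (7)."""
    K = K_matrix(p)
    return np.linalg.solve(K.T @ K, K.T)

rng = np.random.default_rng(0)
p, m = 3, 2
X = rng.standard_normal((p, p))
S = X @ X.T + np.eye(p)              # symmetric test matrix
A = rng.standard_normal((m, p))      # m x p matrices for (5)
B = rng.standard_normal((m, p))
A2 = rng.standard_normal((p, m))     # p x m matrix for (10)
W = S + np.eye(p)                    # nonsingular matrix for (13)
W_inv = np.linalg.inv(W)

K, Km = K_matrix(p), K_minus(p)
M_p = K @ Km
M_m = K_matrix(m) @ K_minus(m)

check_5 = np.allclose(np.kron(A, B) @ vec(S), vec(B @ S @ A.T))
check_6 = np.allclose(K.T @ vec(S), vech_upper(S))
check_8 = np.allclose(Km.T @ vech_upper(S), vec(S))
check_10 = np.allclose(M_p @ np.kron(A2, A2), np.kron(A2, A2) @ M_m)
check_13 = np.allclose(np.linalg.inv(K.T @ np.kron(W, W) @ K),
                       Km @ np.kron(W_inv, W_inv) @ Km.T)
print(check_5, check_6, check_8, check_10, check_13)
```

The column-major `reshape` is what makes the first subscript "nested within the second"; with a row-major reshape the roles of A and B in (5) would be interchanged.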

Let the column vector formed from the diagonal elements of the matrix S
be denoted by diag(S). The $p^2$ x p matrix $H_p$, with typical element

    $[H_p]_{ij,g}$ = 1 if i = j = g
                   = 0 otherwise

may be used to select diag(S) from s:

    $\mathrm{diag}(S) = H_p' s$ .                                  (14)

Let V*W represent the term by term product of V and W with typical
element $[V*W]_{ij} = [V]_{ij}[W]_{ij}$. Since

    $V*W = H_p'(V \otimes W)H_p$ ,                                 (15)

V*W is positive semidefinite if V and W are positive semidefinite.

In subsequent sections it will frequently be convenient to express
a quadratic or bilinear form involving a direct product as a trace using:

    $x'(V \otimes W)y = \mathrm{tr}[XVY'W']$                       (16)

where x = Vec(X) and y = Vec(Y).
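A short numerical check of (14), (15) and (16) may help fix the conventions. This is only an illustrative NumPy sketch: `H_matrix` and `vec` are hypothetical helper names and the matrices are arbitrary random values.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def H_matrix(p):
    """H_p (p^2 x p): element (ij, g) is 1 when i = j = g and 0 otherwise."""
    H = np.zeros((p * p, p))
    for g in range(p):
        H[g * p + g, g] = 1.0
    return H

rng = np.random.default_rng(1)
p = 4
V, W, X, Y = (rng.standard_normal((p, p)) for _ in range(4))
S = X + X.T
H = H_matrix(p)

# (14): H_p' s picks out the diagonal of S.
check_14 = np.allclose(H.T @ vec(S), np.diag(S))
# (15): the term by term product is a compression of the direct product.
check_15 = np.allclose(H.T @ np.kron(V, W) @ H, V * W)
# (16): bilinear form in a direct product written as a trace.
check_16 = np.isclose(vec(X) @ np.kron(V, W) @ vec(Y),
                      np.trace(X @ V @ Y.T @ W.T))
print(check_14, check_15, check_16)
```

Identity (15) is what later allows $p^2$ x $p^2$ direct products to be replaced by p x p term by term products in the linear models of Section 4.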

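We shall regard the q x 1 vector $\gamma$ as a mathematical variable
which can assume values $\gamma_o$ and $\hat\gamma$, where $\hat\gamma$
is an estimate of $\gamma_o$. $\Sigma = \Sigma(\gamma)$ will be regarded
as a matrix function of $\gamma$. When matrix derivatives are given, the
equality of the functions $\sigma_{ij}(\gamma)$ and $\sigma_{ji}(\gamma)$
will always be taken into account. Matrices of partial derivatives such
as $\partial\Sigma/\partial\gamma_i$ and
$\partial^2\Sigma/\partial\gamma_i\partial\gamma_j$ will therefore be
symmetric. $\Sigma_o$, $\partial\Sigma_o/\partial\gamma_i$ and
$\partial^2\Sigma_o/\partial\gamma_i\partial\gamma_j$ will stand for
$\Sigma(\gamma_o)$, $[\partial\Sigma/\partial\gamma_i]_{\gamma=\gamma_o}$
and $[\partial^2\Sigma/\partial\gamma_i\partial\gamma_j]_{\gamma=\gamma_o}$
respectively. A similar convention will be employed when
$\gamma = \hat\gamma$.



3. Generalized Least Squares Estimators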



The model given in (1) may be expressed in the equivalent form

    $\sigma_o = \sigma(\gamma_o)$ .                                (17)

We shall assume throughout that this model satisfies the following
regularity conditions:

(a) All $\sigma_{ij}(\gamma)$ and all partial derivatives of the first
three orders with respect to elements of $\gamma$ are continuous and
bounded in a neighborhood of $\gamma = \gamma_o$.

(b) The $p^2$ x q matrix

    $\Delta = [\partial\sigma(\gamma)/\partial\gamma']_{\gamma=\gamma_o}$   (18)

is of full column rank.

(c) $\gamma_o$ is identified, i.e., $\Sigma(\gamma^*) = \Sigma(\gamma_o)$
implies that $\gamma^* = \gamma_o$.

(d) $\Sigma(\gamma_o)$ is positive definite.



Let us consider the residual quadratic form,

    $\{\bar{s} - \bar\sigma(\gamma)\}'\{\mathrm{Cov}(\bar{s}, \bar{s}')\}^{-1}\{\bar{s} - \bar\sigma(\gamma)\}$ .   (19)

It follows from the Gauss-Markov theorem that, if $\sigma(\gamma)$ is
linear in $\gamma$, minimization of this residual quadratic form yields
the minimum variance unbiased estimator of $\gamma_o$. If
$\sigma(\gamma)$ is nonlinear, the estimator will be asymptotically
efficient.

In order to obtain $\{\mathrm{Cov}(\bar{s}, \bar{s}')\}^{-1}$, the matrix
of this quadratic form, we use (4) to express (3) as

    $\mathrm{Cov}(s_{ij}, s_{gh}) = n^{-1}\big(\tfrac{1}{2}\{[\Sigma_o \otimes \Sigma_o]_{ij,gh} + [\Sigma_o \otimes \Sigma_o]_{ji,hg}\}$
                                   $+ \tfrac{1}{2}\{[\Sigma_o \otimes \Sigma_o]_{ji,gh} + [\Sigma_o \otimes \Sigma_o]_{ij,hg}\}\big)$

so that

    $\mathrm{Cov}(\bar{s}, \bar{s}') = 2n^{-1}K_p'(\Sigma_o \otimes \Sigma_o)K_p$ .   (20)

Then, (13) shows that the required inverse is

    $\{\mathrm{Cov}(\bar{s}, \bar{s}')\}^{-1} = 2^{-1}n K_p^-(\Sigma_o^{-1} \otimes \Sigma_o^{-1})K_p^{-\prime}$   (21)

so that, with use of (8), the quadratic form (19), which we now denote
by $nf(\gamma|\Sigma_o^{-1})$, becomes

    $nf(\gamma|\Sigma_o^{-1}) = 2^{-1}n\{\bar{s} - \bar\sigma(\gamma)\}'K_p^-(\Sigma_o^{-1} \otimes \Sigma_o^{-1})K_p^{-\prime}\{\bar{s} - \bar\sigma(\gamma)\}$
                              $= 2^{-1}n\{s - \sigma(\gamma)\}'(\Sigma_o^{-1} \otimes \Sigma_o^{-1})\{s - \sigma(\gamma)\}$ .   (22)

The matrix of this quadratic form is a function of the unknown
dispersion matrix $\Sigma_o$. We shall therefore replace
$\Sigma_o^{-1}$ by another matrix, V, and consider G.L.S. estimators
which result from minimizing

    $f(\gamma|V) = 2^{-1}\{s - \sigma(\gamma)\}'(V \otimes V)\{s - \sigma(\gamma)\}$   (23)

with respect to $\gamma$. The weight matrix, V, will be either a
stochastic matrix which converges in probability to a positive definite
matrix $\bar{V}$ as $n \to \infty$ or a positive definite constant
matrix ($V = \bar{V}$). Consequently the matrix of the quadratic form in
(23) is positive definite or converges in probability to a positive
definite matrix, $\bar{V} \otimes \bar{V}$. Using (16) this quadratic
form may also be expressed as:

    $f(\gamma|V) = 2^{-1}\mathrm{tr}[\{S - \Sigma(\gamma)\}V]^2$ .   (24)

We shall examine asymptotic properties of the estimators. 
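The equivalence of the direct product form (23) and the trace form (24) can be illustrated as follows. This is a sketch only: the function names `f_direct_product` and `f_trace` and the example matrices are assumptions introduced here, not the paper's.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def f_direct_product(S, Sigma, V):
    """Discrepancy (23): half the quadratic form in V (x) V."""
    r = vec(S - Sigma)
    return 0.5 * r @ np.kron(V, V) @ r

def f_trace(S, Sigma, V):
    """Discrepancy (24): half the trace of [(S - Sigma) V]^2."""
    R = (S - Sigma) @ V
    return 0.5 * np.trace(R @ R)

rng = np.random.default_rng(2)
p = 4
X = rng.standard_normal((30, p))
S = X.T @ X / 30                       # a sample-covariance-like matrix
Sigma = np.diag(np.diag(S))            # some candidate Sigma(gamma)
V = np.linalg.inv(S)                   # one possible weight matrix

f23 = f_direct_product(S, Sigma, V)
f24 = f_trace(S, Sigma, V)
print(f23, f24)
```

The trace form needs only p x p products, whereas the direct product form builds a $p^2$ x $p^2$ matrix, so (24) is the natural one to use in a program.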

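Proposition 1. The G.L.S. estimators are consistent.

Proof. Since $\gamma_o$ is identified and $\bar{V}$ is positive
definite, $\mathrm{tr}[\{\Sigma_o - \Sigma(\gamma)\}\bar{V}]^2$ has its
absolute minimum of zero at $\gamma = \gamma_o$. S and V converge
stochastically to $\Sigma_o$ and $\bar{V}$, and $\Sigma(\gamma)$ is
bounded in a neighborhood of $\gamma = \gamma_o$. Consequently
$\mathrm{tr}[\{S - \Sigma(\gamma)\}V]^2$ converges in probability to
$\mathrm{tr}[\{\Sigma_o - \Sigma(\gamma)\}\bar{V}]^2$ uniformly in a
neighborhood of $\gamma = \gamma_o$. Since
$\mathrm{tr}[\{S - \Sigma(\gamma)\}V]^2$ is continuous in $\gamma$, the
point $\hat\gamma$ where it has its absolute minimum converges
stochastically to $\gamma_o$. This proof is an adaptation of a proof of
Anderson & Rubin (1956, pp. 145-146). ||

Proposition 2. The limiting distribution of a G.L.S. estimator,
$\hat\gamma$, is multivariate normal with mean vector

    $E(\hat\gamma) = \gamma_o$                                     (25)

and covariance matrix

    $\mathrm{Cov}(\hat\gamma, \hat\gamma') = 2n^{-1}\{\Theta(\bar{V})\}^{-1}\Theta(\bar{V}\Sigma_o\bar{V})\{\Theta(\bar{V})\}^{-1}$   (26)

where $\Theta(\bar{V})$ is a q x q matrix function of $\bar{V}$ defined
by

    $\Theta(\bar{V}) = \Delta'(\bar{V} \otimes \bar{V})\Delta$     (27)

with typical element [cf. (16), (18)]

    $[\Theta(\bar{V})]_{ij} = \mathrm{tr}\big(\frac{\partial\Sigma_o}{\partial\gamma_i}\bar{V}\frac{\partial\Sigma_o}{\partial\gamma_j}\bar{V}\big)$ .

Proof. Let

    $h(\gamma|V) = -\frac{\partial f(\gamma|V)}{\partial\gamma} = \frac{\partial\sigma'}{\partial\gamma}(V \otimes V)\{s - \sigma(\gamma)\}$ .

Using (16), a typical element of this vector may be expressed as

    $h_i(\gamma|V) = \mathrm{tr}[V\{S - \Sigma(\gamma)\}V\,\partial\Sigma/\partial\gamma_i]$ .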
By Taylor's theorem

    $h(\hat\gamma|V) = h(\gamma_o|V) - W(\hat\gamma - \gamma_o)$   (28)

where

    $[W]_{ij} = -[\partial h_i(\gamma|V)/\partial\gamma_j]_{\gamma=\gamma^*}$

and $\gamma^*$ lies between $\gamma_o$ and $\hat\gamma$.



Now, 



^3 ^1 J 1 J 



and 



^ . tr[v(s - ^(z)iV57^ " " % ' 



- V ir- V 



(29) 



(50) 



Since the elements of $\{S - \Sigma(\gamma_o)\}$ and
$(\hat\gamma - \gamma_o)$ converge to zero in probability, since the
trace functions in (29) and (30) are continuous, and since the partial
derivatives are asymptotically bounded in probability, it follows that
$[W]_{ij}$ converges stochastically to
$\mathrm{tr}(\frac{\partial\Sigma_o}{\partial\gamma_i}\bar{V}\frac{\partial\Sigma_o}{\partial\gamma_j}\bar{V})$, or

    $\mathrm{plim}_{n\to\infty} W = \Delta'(\bar{V} \otimes \bar{V})\Delta = \Theta(\bar{V})$

as can be seen from (16), (18). This matrix is nonsingular.
Since $h(\hat\gamma|V) = 0$, it follows from (28) that

    $\hat\gamma = \gamma_o + W^{-1}h(\gamma_o|V)$

and $\hat\gamma$ is asymptotically equivalent to

    $\tilde\gamma = \gamma_o + \{H(\bar{V})\}'(s - \sigma_o)$ ,

where

    $H(\bar{V}) = (\bar{V} \otimes \bar{V})\Delta\{\Theta(\bar{V})\}^{-1}$ ,   (31)

because

    $\sqrt{n}(\hat\gamma - \tilde\gamma) = \{W^{-1} - [\Delta'(\bar{V} \otimes \bar{V})\Delta]^{-1}\}\Delta'(\bar{V} \otimes \bar{V})\{\sqrt{n}(s - \sigma_o)\}$
        $+ W^{-1}\Delta'\{(V \otimes V) - (\bar{V} \otimes \bar{V})\}\{\sqrt{n}(s - \sigma_o)\}$   (32)

converges in probability to the null vector as $n \to \infty$.

Since $\tilde\gamma$ is a linear function of s, the limiting
distribution of $\tilde\gamma$ and of $\hat\gamma$ is multivariate
normal with mean vector

    $\{H(\bar{V})\}'\Delta\gamma_o = \gamma_o$

and dispersion matrix

    $\{H(\bar{V})\}'K_p^{-\prime}\,\mathrm{Cov}(\bar{s}, \bar{s}')\,K_p^-\{H(\bar{V})\} = 2n^{-1}\{H(\bar{V})\}'M_p(\Sigma_o \otimes \Sigma_o)M_p\{H(\bar{V})\}$ .

This dispersion matrix may be expressed in the form of (26) after use of
(31), (27), (10), (12), and the fact that each column of $\Delta$ is
formed from a symmetric matrix, $\partial\Sigma_o/\partial\gamma_i$. ||

All G.L.S. estimators of $\gamma_o$, then, are consistent and
asymptotically normally distributed. The "best" G.L.S. (B.G.L.S.)
estimators, in the sense of having minimum asymptotic variances, are
obtained by taking V to be some consistent estimator of
$\kappa\Sigma_o^{-1}$, where $\kappa$ is any positive constant.

Proposition 3. The asymptotic dispersion matrix of a G.L.S. estimator,
$\hat\gamma$, is bounded below by
$2n^{-1}\{\Theta(\Sigma_o^{-1})\}^{-1}$ in the Loewner sense of
inequality (e.g., Beckenbach & Bellman, 1965, p. 86). This bound is
attained, and $\hat\gamma$ is a B.G.L.S. estimator, if
$\bar{V} = \kappa\Sigma_o^{-1}$ ($\kappa > 0$).

Proof.

    $\{\Theta(\bar{V})\}^{-1}\Theta(\bar{V}\Sigma_o\bar{V})\{\Theta(\bar{V})\}^{-1} - \{\Theta(\Sigma_o^{-1})\}^{-1}$
        $= \{H(\bar{V}) - H(\Sigma_o^{-1})\}'(\Sigma_o \otimes \Sigma_o)\{H(\bar{V}) - H(\Sigma_o^{-1})\}$
        $\ge 0$

since $\Sigma_o \otimes \Sigma_o > 0$. ||
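Propositions 2 and 3 can be illustrated numerically. The sketch below assumes a small linear structure (a common covariance plus three unique variances) chosen purely for illustration; the names `theta`, `asymptotic_cov` and `basis` are hypothetical. It evaluates the covariance matrix (26) for $\bar{V} = I$ and for $\bar{V} = 3\Sigma_o^{-1}$ and compares both with the lower bound.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def theta(Delta, V):
    """Theta(V) = Delta' (V (x) V) Delta, as in (27)."""
    return Delta.T @ np.kron(V, V) @ Delta

def asymptotic_cov(Delta, Sigma_o, V_bar, n):
    """Sandwich form (26) of the asymptotic covariance matrix."""
    T_inv = np.linalg.inv(theta(Delta, V_bar))
    return (2.0 / n) * T_inv @ theta(Delta, V_bar @ Sigma_o @ V_bar) @ T_inv

# Assumed linear structure: Sigma = g1 * 11' + diag(g2, g3, g4).
basis = [np.ones((3, 3)),
         np.diag([1.0, 0.0, 0.0]),
         np.diag([0.0, 1.0, 0.0]),
         np.diag([0.0, 0.0, 1.0])]
Delta = np.column_stack([vec(Bk) for Bk in basis])
gamma_o = np.array([0.5, 1.0, 2.0, 3.0])
Sigma_o = sum(g * Bk for g, Bk in zip(gamma_o, basis))
Sigma_inv = np.linalg.inv(Sigma_o)
n = 100

bound = (2.0 / n) * np.linalg.inv(theta(Delta, Sigma_inv))
cov_unweighted = asymptotic_cov(Delta, Sigma_o, np.eye(3), n)
cov_best = asymptotic_cov(Delta, Sigma_o, 3.0 * Sigma_inv, n)

# Proposition 3: the difference from the bound is positive semidefinite,
# and vanishes when V_bar is proportional to the inverse of Sigma_o.
gap_eigenvalues = np.linalg.eigvalsh(cov_unweighted - bound)
print(gap_eigenvalues.min(), np.abs(cov_best - bound).max())
```

The positive constant $\kappa$ (here 3) cancels in the sandwich, which is why any consistent estimator of a positive multiple of $\Sigma_o^{-1}$ gives a B.G.L.S. estimator.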

In order to prove asymptotic efficiency of B.G.L.S. estimators we
would have to show that the difference between
$2n^{-1}\{\Theta(\Sigma_o^{-1})\}^{-1}$ and the inverse information
matrix (based on the exact distribution of S) is of the order
$o(n^{-1})$. If S has a Wishart distribution, this difference is the
null matrix so that all B.G.L.S. estimators are efficient. If we assume
only that the limiting distribution of S is multivariate normal with
parameters given by (17) and (20), we can say that B.G.L.S. estimators
are "efficient in terms of the limiting distribution of S" in the
following sense:

Proposition 4. Let $\Omega$ denote the information matrix based on the
limiting distribution of S. Then

    $\lim_{n\to\infty} n[2n^{-1}\{\Theta(\Sigma_o^{-1})\}^{-1} - \Omega^{-1}] = 0$ .   (33)

Proof. The log of the likelihood function for the limiting multivariate
normal distribution of $\bar{s}$ is

    $\log L = \mathrm{constant} - \tfrac{1}{2}\{\log|K_p'\{\Sigma(\gamma) \otimes \Sigma(\gamma)\}K_p| + \tfrac{n}{2}\mathrm{tr}[S\{\Sigma(\gamma)\}^{-1} - I]^2\}$
with first derivatives, 

'1 'i ^i 

and second derivatives 

- Z\S - E)E-1SE-1 ^ ] . (P - l)n-^- triz^ 5^ ' 

Using (16), (17), and (20) it can easily be shown that, if $Q_1$ and
$Q_2$ are p x p matrices and $\delta$ = 0 or 1,

    $E\,\mathrm{tr}\{(S - \delta\Sigma_o)Q_1SQ_2\} = (1 - \delta)\,\mathrm{tr}(\Sigma_oQ_1\Sigma_oQ_2) + n^{-1}\{\mathrm{tr}(\Sigma_oQ_1\Sigma_oQ_2')$
        $+ \mathrm{tr}(\Sigma_oQ_1)\,\mathrm{tr}(\Sigma_oQ_2)\}$ .   (35)

Application of (35) to (34) then shows that



[a].. = -e( -x^-r-^ 



1 J 



SO that 



and (33) follows. ||



In addition to yielding a B.G.L.S. estimator of $\gamma_o$, use of a
consistent estimator of $\Sigma_o^{-1}$ for V enables one to test the
null hypothesis that (1) holds against the alternative that $\Sigma_o$
is any positive definite matrix by means of the residual quadratic form
$f(\hat\gamma|V)$.

Proposition 5. If $\bar{V} = \Sigma_o^{-1}$ and
$\Sigma_o = \Sigma(\gamma_o)$, the limiting distribution of
$nf(\hat\gamma|V) = 2^{-1}n\,\mathrm{tr}[\{S - \Sigma(\hat\gamma)\}V]^2$
is chi-square with p(p + 1)/2 - q degrees of freedom.

Proof. It was seen, using equation (32), that
$\sqrt{n}(\hat\gamma - \tilde\gamma)$ converges in probability to a null
vector. Also
$\sqrt{n}\{\sigma(\hat\gamma) - \sigma_o - \Delta(\hat\gamma - \gamma_o)\}$
converges in probability to a null vector since, by Taylor's theorem,

where $\gamma^*$ lies between $\hat\gamma$ and $\gamma_o$.

Consequently $\sqrt{n}\{s - \sigma(\hat\gamma)\}$ converges
stochastically to

    $\sqrt{n}[s - \sigma_o - \Delta(\tilde\gamma - \gamma_o)]$
        $= \sqrt{n}[I - \Delta\{\Delta'(\Sigma_o^{-1} \otimes \Sigma_o^{-1})\Delta\}^{-1}\Delta'(\Sigma_o^{-1} \otimes \Sigma_o^{-1})](s - \sigma_o)$

and

    $nf(\hat\gamma|V) = 2^{-1}n\{s - \sigma(\hat\gamma)\}'(V \otimes V)\{s - \sigma(\hat\gamma)\}$

converges stochastically to

where

Since $G\{K_p'(\Sigma_o \otimes \Sigma_o)K_p\}$ is idempotent of rank
{p(p + 1)/2 - q} the limiting distribution of $nf_o$ and of
$nf(\hat\gamma|V)$ is the central chi-square distribution with
{p(p + 1)/2 - q} degrees of freedom (Graybill, 1961, p. 83). ||

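Anderson (1969, Section 4), considering linear covariance structures,
has pointed out certain relationships between equations defining a G.L.S.
estimate with $V = \hat\Sigma^{-1}$ and the Wishart likelihood equations.
We shall now consider how, for covariance structures in general, an
estimate of $\gamma_o$ obtained by maximizing the Wishart likelihood
function (M.W.L. estimator) may be regarded as a member of the class of
B.G.L.S. estimates.

Proposition 6. Suppose that $\hat\gamma_1$ is a M.W.L. estimate of
$\gamma_o$ and that $\hat\gamma_2$ is a G.L.S. estimate where
$V = \{\Sigma(\hat\gamma_1)\}^{-1}$. Then $\hat\gamma_2$ is a B.G.L.S.
estimate and $\mathrm{Prob}(\hat\gamma_1 \ne \hat\gamma_2) \to 0$ as
$n \to \infty$.

Proof. Maximizing the Wishart likelihood function is equivalent to
minimizing

    $F(\gamma) = \ln|\Sigma(\gamma)| - \ln|S| + \mathrm{tr}[S\{\Sigma(\gamma)\}^{-1}] - p$ .   (36)

Consequently the equations,

    $\frac{\partial F}{\partial\gamma_i} = -\mathrm{tr}[\Sigma^{-1}(S - \Sigma)\Sigma^{-1}\frac{\partial\Sigma}{\partial\gamma_i}] = 0$ ,   i = 1 ... q ,   (37)

and the condition that the matrix with typical element

    $\frac{\partial^2 F}{\partial\gamma_i\partial\gamma_j} = \mathrm{tr}\{\Sigma^{-1}(2S - \Sigma)\Sigma^{-1}\frac{\partial\Sigma}{\partial\gamma_i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\gamma_j} - \Sigma^{-1}(S - \Sigma)\Sigma^{-1}\frac{\partial^2\Sigma}{\partial\gamma_i\partial\gamma_j}\}$   (38)

be positive definite will be satisfied at the point
$\gamma = \hat\gamma_1$ ($\hat\Sigma = \Sigma(\hat\gamma_1)$).
The equations

    $\frac{\partial f(\gamma|V)}{\partial\gamma_i} = -\mathrm{tr}\{V(S - \Sigma)V\frac{\partial\Sigma}{\partial\gamma_i}\} = 0$ ,   i = 1 ... q ,   (39)

and the condition that the matrix with typical element

    $\frac{\partial^2 f(\gamma|V)}{\partial\gamma_i\partial\gamma_j} = \mathrm{tr}\{V\frac{\partial\Sigma}{\partial\gamma_i}V\frac{\partial\Sigma}{\partial\gamma_j} - V(S - \Sigma)V\frac{\partial^2\Sigma}{\partial\gamma_i\partial\gamma_j}\}$   (40)

be positive definite will be satisfied at the point
$\gamma = \hat\gamma_2$ if $V = \{\Sigma(\hat\gamma_1)\}^{-1}$.

Using similar reasoning to that used in the proof of Proposition 1
it can be shown (cf. Anderson & Rubin, 1956, Theorem 12.1) that the
M.W.L. estimator, $\hat\gamma_1$, is a consistent estimator of
$\gamma_o$. Consequently $\{\Sigma(\hat\gamma_1)\}^{-1}$ is a consistent
estimator of $\Sigma_o^{-1}$ and $\hat\gamma_2$ is a B.G.L.S. estimator.

Equations (39) and (37) are equivalent when
$V = \{\Sigma(\hat\gamma_1)\}^{-1}$. Consequently
$\gamma = \hat\gamma_1$ is always a stationary point of
$f(\gamma|\{\Sigma(\hat\gamma_1)\}^{-1})$ and will not be at a minimum
only if the matrix with typical element given by (40) is not positive
definite. Since the matrix with typical element (38) is positive
definite at $\gamma = \hat\gamma_1$ and since the difference

    $\big[\frac{\partial^2 F}{\partial\gamma_i\partial\gamma_j} - \frac{\partial^2 f}{\partial\gamma_i\partial\gamma_j}\big]_{\gamma=\hat\gamma_1} = 2\,\mathrm{tr}\big[\Sigma^{-1}(S - \Sigma)\Sigma^{-1}\frac{\partial\Sigma}{\partial\gamma_i}\Sigma^{-1}\frac{\partial\Sigma}{\partial\gamma_j}\big]_{\gamma=\hat\gamma_1}$

converges stochastically to zero, the probability that the matrix with
typical element (40) is not positive definite at the point
$\gamma = \hat\gamma_1$ tends to zero as $n \to \infty$. This implies
that the probability that the point $\hat\gamma_1$ at which $F(\gamma)$
has an absolute minimum does not give at least a relative minimum of
$f(\gamma|\{\Sigma(\hat\gamma_1)\}^{-1})$ tends to zero as
$n \to \infty$. Since $f(\gamma|\{\Sigma(\hat\gamma_1)\}^{-1})$ is convex
in a neighborhood of $\gamma_o$ and since $\hat\gamma_1$ and
$\hat\gamma_2$ both converge stochastically to $\gamma_o$, the
probability that there is a minimum at $\hat\gamma_1$ which does not
coincide with the absolute minimum at $\hat\gamma_2$ tends to zero as
$n \to \infty$. ||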

This result implies that M.W.L. estimators will have the asymptotic
properties of B.G.L.S. estimators provided only that the limiting
distribution of S is the multivariate normal distribution specified
earlier (and that the model satisfies the specified regularity
conditions). No assumption of a Wishart distribution for S has been made.

Jöreskog & Goldberger (1972) have shown that the log likelihood
ratio test statistic and a certain residual quadratic form converge in
probability to one another in the particular case of unrestricted factor
analysis. For covariance structures in general we may state:

Proposition 7. If $\hat\gamma$ is a B.G.L.S. (or M.W.L.) estimator,
$nF(\hat\gamma)$ and $nf(\hat\gamma|\{\Sigma(\hat\gamma)\}^{-1})$
converge stochastically to one another and have a limiting chi-square
distribution with p(p + 1)/2 - q degrees of freedom.

Proof. Rearrangement of terms in (36) gives

    $F(\hat\gamma) = \mathrm{tr}\{(S - \hat\Sigma)\hat\Sigma^{-1}\} - \ln|I + (S - \hat\Sigma)\hat\Sigma^{-1}|$ .

Using Taylor expansions in eigenvalues of
$(S - \hat\Sigma)\hat\Sigma^{-1}$, it is easily shown that

    $-\ln|I + (S - \hat\Sigma)\hat\Sigma^{-1}| = \sum_{k=1}^{\infty} k^{-1}\,\mathrm{tr}\{(\hat\Sigma - S)\hat\Sigma^{-1}\}^k$ .

Consequently,

    $nF(\hat\gamma) = nf(\hat\gamma|\hat\Sigma^{-1}) + n\sum_{k=3}^{\infty} k^{-1}\,\mathrm{tr}\{(\hat\Sigma - S)\hat\Sigma^{-1}\}^k$
                    $= nf(\hat\gamma|\hat\Sigma^{-1}) + o_p(1)$ .

The limiting distribution of $nf(\hat\gamma|\hat\Sigma^{-1})$ follows
from Proposition 5. ||

Consequently either $nf(\hat\gamma|\hat\Sigma^{-1})$ or $nF(\hat\gamma)$
may be used in a large sample test of the null hypothesis that (1) holds
when $\hat\gamma$ is a M.W.L. estimate. For many covariance structures
the form of $F(\gamma)$ given in (36) simplifies at the minimum.

Proposition 8. Suppose that $\Sigma(\gamma)$ is such that, given any
admissible $\gamma$ and any positive scalar $\alpha$, there is an
admissible $\gamma^*$ for which $\Sigma(\gamma^*) = \alpha\Sigma(\gamma)$.
Then, if $\hat\gamma$ is a M.W.L. estimate,
$\mathrm{tr}[S\hat\Sigma^{-1}] = p$ so that

    $F(\hat\gamma) = \ln|\hat\Sigma| - \ln|S|$ .

This result was stated by Bock & Bargmann (1966, p. 521) for certain
specific covariance structures. Their proof, however, applies to the
general situation considered here.

4. Linear Covariance Structures

When $\Sigma(\gamma)$ is nonlinear, a successive approximation
procedure, such as Newton's method, is required to obtain both G.L.S.
and M.W.L. estimates. General expressions for the necessary derivatives
are given in (37), (38), (39), and (40). When the specific forms of
$\partial\Sigma/\partial\gamma_i$ and
$\partial^2\Sigma/\partial\gamma_i\partial\gamma_j$ are known, these
expressions may be simplified using methods given by Bargmann
(1967, Section 7).

When $\Sigma(\gamma)$ is linear in $\gamma$, on the other hand, G.L.S.
estimates may be expressed in closed form. A successive approximation
procedure is still usually required for M.W.L. estimates (except in some
special cases such as the compound symmetry model).

We can always express a linear structure $\Sigma(\gamma)$ in the form

    $\sigma(\gamma) = \Delta\gamma$                                (41)

where $\Delta$ ($= \partial\sigma/\partial\gamma'$) is a known matrix of
order $p^2$ x q and rank q. Use of (39), (16), and (5) then shows that
the G.L.S. estimates of $\gamma_o$ are:

    $\hat\gamma = \{\Theta(V)\}^{-1}\Delta'\,\mathrm{Vec}(VSV)$    (42)

where

    $\Theta(V) = \Delta'(V \otimes V)\Delta$ .

Whenever $\Delta$ is of full column rank and V is positive definite,
$f(\gamma|V)$ is convex and has a unique minimum at
$\gamma = \hat\gamma$. $\Theta(V)$ then is positive definite.
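The closed form (42) translates directly into a few lines of NumPy. The following is an illustrative sketch only: the function names `gls_linear` and `discrepancy`, and the small linear structure used as an example, are assumptions made here rather than anything specified in the paper.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def gls_linear(S, Delta, V):
    """Closed-form G.L.S. estimate (42) for sigma(gamma) = Delta gamma."""
    Theta = Delta.T @ np.kron(V, V) @ Delta
    gamma_hat = np.linalg.solve(Theta, Delta.T @ vec(V @ S @ V))
    return gamma_hat, Theta

def discrepancy(S, Delta, V, gamma):
    """f(gamma | V) from (24) for the linear structure."""
    p = S.shape[0]
    Sigma = (Delta @ gamma).reshape(p, p, order="F")
    R = (S - Sigma) @ V
    return 0.5 * np.trace(R @ R)

# Assumed example structure: Sigma = g1 * 11' + diag(g2, g3, g4).
basis = [np.ones((3, 3)),
         np.diag([1.0, 0.0, 0.0]),
         np.diag([0.0, 1.0, 0.0]),
         np.diag([0.0, 0.0, 1.0])]
Delta = np.column_stack([vec(Bk) for Bk in basis])
gamma_true = np.array([0.5, 1.0, 2.0, 3.0])
S_exact = (Delta @ gamma_true).reshape(3, 3, order="F")

# When S satisfies the structure exactly, every weight matrix recovers gamma.
gamma_uls, _ = gls_linear(S_exact, Delta, np.eye(3))
gamma_gls, Theta = gls_linear(S_exact, Delta, np.linalg.inv(S_exact))
f_min = discrepancy(S_exact, Delta, np.linalg.inv(S_exact), gamma_gls)
print(gamma_uls, gamma_gls, f_min)
```

For larger p one would avoid forming `np.kron(V, V)` explicitly and build $\Theta(V)$ from the traces $\mathrm{tr}(\frac{\partial\Sigma}{\partial\gamma_i}V\frac{\partial\Sigma}{\partial\gamma_j}V)$ instead; the direct product is kept here only to mirror (42).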

If V is a fixed matrix (e.g., V = I), or a stochastic matrix
distributed independently of S, $\hat\gamma$ is an unbiased estimator of
$\gamma_o$. If V is a consistent estimator of $\Sigma_o^{-1}$ (e.g.,
$V = S^{-1}$, or $V = \{\Sigma(\hat\gamma)\}^{-1}$ evaluated at a
previously obtained consistent estimate), $\hat\gamma$ is a B.G.L.S.
estimator of $\gamma_o$ and $2n^{-1}\{\Theta(V)\}^{-1}$ is a consistent
estimator of the asymptotic covariance matrix of $\hat\gamma$
(Proposition 3). Also, the statistic

    $nf(\hat\gamma|V) = 2^{-1}n\,\mathrm{tr}[(S - \hat\Sigma)V(S - \hat\Sigma)V]$
                      $= 2^{-1}n[s'(V \otimes V)s - \hat\gamma'\{\Theta(V)\}\hat\gamma]$

is approximately distributed as chi-square with {p(p + 1)/2} - q degrees
of freedom if n is large and the null hypothesis
$\sigma_o = \Delta\gamma_o$ holds (Proposition 5).

The M.W.L. estimate $\hat\gamma$ is defined by (42) with V replaced by
$\{\Sigma(\hat\gamma)\}^{-1}$, and will simultaneously be a G.L.S.
estimate in the sense of minimizing
$f(\gamma|\{\Sigma(\hat\gamma)\}^{-1})$ (Proposition 6) whenever
$\Sigma(\hat\gamma)$ is positive definite. This M.W.L. estimate may be
calculated by means of a successive approximation procedure:

1./ Use (42) with $V = S^{-1}$ to obtain $\hat\gamma_{(1)}$.

2./ Use (42) with $V = \{\Sigma(\hat\gamma_{(1)})\}^{-1}$ to obtain
    $\hat\gamma_{(2)}$.

3./ Continue in this way until the differences
    $\hat\gamma_{(i+1)} - \hat\gamma_{(i)}$ become sufficiently small.

It is easily shown that this successive G.L.S. procedure is equivalent to
the Fisher scoring method (Kendall & Stuart, 1967, pp. 48-49) for
obtaining M.W.L. estimates. (When $\Sigma(\gamma)$ is not linear in
$\gamma$, however, minimizing
$f(\gamma|\{\Sigma(\hat\gamma_{(i)})\}^{-1})$ to obtain
$\hat\gamma_{(i+1)}$ is no longer equivalent to the Fisher scoring
method.)

The successive G.L.S. estimators $\hat\gamma_{(1)}$, $\hat\gamma_{(2)}$,
$\hat\gamma_{(3)}$, ... are all B.G.L.S. estimators and have the same
asymptotic properties. It is therefore difficult to justify the
calculation of precise M.W.L. estimates, particularly if more than three
or four iterations are required.
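The successive G.L.S. procedure described above can be sketched as a short loop. This is an illustration under assumptions: the function names, the tolerance, the iteration cap and the small perturbed example matrix are all hypothetical choices, and no safeguard is included for the case where an intermediate $\Sigma(\hat\gamma_{(i)})$ is not positive definite.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def gls_step(S, Delta, V):
    """One application of (42): {Theta(V)}^{-1} Delta' Vec(VSV)."""
    Theta = Delta.T @ np.kron(V, V) @ Delta
    return np.linalg.solve(Theta, Delta.T @ vec(V @ S @ V))

def successive_gls(S, Delta, tol=1e-10, max_iter=50):
    """Iterate (42), starting from V = S^{-1}, then V = Sigma(gamma)^{-1}."""
    p = S.shape[0]
    gamma = gls_step(S, Delta, np.linalg.inv(S))            # step 1
    for iteration in range(2, max_iter + 1):                # steps 2, 3, ...
        Sigma = (Delta @ gamma).reshape(p, p, order="F")    # Sigma(gamma), (41)
        new_gamma = gls_step(S, Delta, np.linalg.inv(Sigma))
        converged = np.max(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if converged:
            break
    return gamma, iteration

# Assumed example: Sigma = g1 * 11' + diag(g2, g3, g4), with S slightly off-model.
basis = [np.ones((3, 3)),
         np.diag([1.0, 0.0, 0.0]),
         np.diag([0.0, 1.0, 0.0]),
         np.diag([0.0, 0.0, 1.0])]
Delta = np.column_stack([vec(Bk) for Bk in basis])
Sigma_true = (Delta @ np.array([0.5, 1.0, 2.0, 3.0])).reshape(3, 3, order="F")
E = 0.05 * np.array([[1.0, -1.0, 0.5], [-1.0, 0.5, 1.0], [0.5, 1.0, -1.0]])
S = Sigma_true + E

gamma_mwl, n_iter = successive_gls(S, Delta)
Sigma_hat = (Delta @ gamma_mwl).reshape(3, 3, order="F")
Sigma_hat_inv = np.linalg.inv(Sigma_hat)
# Left-hand sides of the Wishart likelihood equations (37) at the final estimate.
score = Delta.T @ vec(Sigma_hat_inv @ (S - Sigma_hat) @ Sigma_hat_inv)
print(gamma_mwl, n_iter, np.abs(score).max())
```

At a fixed point of the loop the G.L.S. equations (39) with $V = \hat\Sigma^{-1}$ coincide with the likelihood equations (37), which is why the score vector is used as the convergence check.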

McDonald (1972) has investigated patterned covariance structures where
subsets of elements of $\Sigma_o$ are equal or have a known value,
usually zero. In such models, where elements of $\Delta$ are either 1 or
0, (42) would be employed without further algebraic manipulation to
provide G.L.S. estimates. Use of (4) would avoid storage of the large
matrix $V \otimes V$ by a computer program.

In other linear covariance structures, however, $\Delta$ involves direct
products of certain matrices and (42) may be simplified considerably. We
shall now examine such models in greater detail. They are of the form

    $\Sigma = \Lambda\Phi\Lambda' + D_\psi$                        (43)

where the p x m "model matrix" $\Lambda$ is known and of full column
rank, $\Phi$ is symmetric of order m, and $D_\psi$ is diagonal of order
p. Models of this kind have been discussed by Bock & Bargmann
(1966, p. 510), Mukherjee (1970), and Jöreskog (1970a, Sections 2.4 and
2.5). Newton methods for obtaining M.W.L. estimates of $\Phi_o$ and
$D_{\psi o}$ are available (Bock & Bargmann, 1966; Anderson, 1970) and
the methods proposed by Jöreskog (1970a) may also be employed.

It will be convenient to consider separately the cases where $\Phi$ is
diagonal, $\Phi = D_\phi$, and where $\Phi$ is symmetric but not
diagonal.

Case I. $\Phi$ is diagonal.

When $\Phi = D_\phi$, (43) may be expressed in the form of (41) with

    $\Delta = \{(\Lambda \otimes \Lambda)H_m, H_p\}$ ,

    $\gamma' = (\phi', \psi') = \{\mathrm{diag}'(D_\phi), \mathrm{diag}'(D_\psi)\}$ ,

    q = m + p .

Then, using (15), it can be shown that

    $\Theta(V) = \begin{pmatrix} (\Lambda'V\Lambda)*(\Lambda'V\Lambda) & (\Lambda'V)*(\Lambda'V) \\ (V\Lambda)*(V\Lambda) & V*V \end{pmatrix}$   (44)

and, using (5) and (14), that

    $\Delta'\,\mathrm{Vec}(VSV) = \begin{pmatrix} \mathrm{diag}(\Lambda'VSV\Lambda) \\ \mathrm{diag}(VSV) \end{pmatrix}$ .   (45)

Substitution of (44) and (45) in (42) now provides the estimate
$\hat\gamma$. The matrix to be inverted, $\Theta(V)$, is positive
semidefinite provided that V is positive definite. Singularity of the
matrix implies that $\gamma_o$ is not identified.
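The block expressions (44) and (45) can be compared with the general definitions $\Theta(V) = \Delta'(V \otimes V)\Delta$ and $\Delta'\mathrm{Vec}(VSV)$. The sketch below uses an arbitrary 5 x 2 model matrix and random data chosen for illustration; the variable names are hypothetical.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def H_matrix(p):
    """H_p (p^2 x p): selects the diagonal elements from Vec of a p x p matrix."""
    H = np.zeros((p * p, p))
    for g in range(p):
        H[g * p + g, g] = 1.0
    return H

rng = np.random.default_rng(3)
Lam = np.column_stack([np.ones(5), np.arange(5.0)])   # assumed model matrix
p, m = Lam.shape
X = rng.standard_normal((40, p))
S = X.T @ X / 40
V = np.linalg.inv(S)

# General definitions, built from the p^2 x q matrix Delta.
Delta = np.hstack([np.kron(Lam, Lam) @ H_matrix(m), H_matrix(p)])
Theta_direct = Delta.T @ np.kron(V, V) @ Delta
rhs_direct = Delta.T @ vec(V @ S @ V)

# Block forms (44) and (45): only term by term products of small matrices.
LVL = Lam.T @ V @ Lam
LV = Lam.T @ V
Theta_blocks = np.block([[LVL * LVL, LV * LV],
                         [LV.T * LV.T, V * V]])
rhs_blocks = np.concatenate([np.diag(Lam.T @ V @ S @ V @ Lam),
                             np.diag(V @ S @ V)])

gamma_hat = np.linalg.solve(Theta_blocks, rhs_blocks)   # estimate from (42)
print(np.allclose(Theta_direct, Theta_blocks),
      np.allclose(rhs_direct, rhs_blocks))
```

The block form never creates a matrix larger than (m + p) x (m + p), whereas the direct route requires the $p^2$ x $p^2$ product $V \otimes V$.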

We have minimized $f(\gamma|V)$ without imposing any constraints and
some elements of $\hat\gamma$ could be negative. The elements of
$\gamma$, however, represent variances (cf. Bock & Bargmann, 1966) so
that it would be preferable for the elements of $\hat\gamma$ to be
nonnegative. Minimization of $f(\gamma|V)$ subject to the inequality
constraints

    $\gamma_i \ge 0$ ,   i = 1 ... q                               (46)

may be accomplished by applying the "sweep" operator (Dempster, 1969,
Section 4.3.2; Morgan & Tatar, 1972) to the symmetric matrix Q of order
q + 1 which is defined initially as

    $Q = \begin{pmatrix} Q_{11} & q_{12} \\ q_{12}' & q_{22} \end{pmatrix}$

where

    $Q_{11} = \Theta(V)$ as defined in (44) ,
    $q_{12} = \Delta'\,\mathrm{Vec}(VSV)$ as defined in (45) ,
    $q_{22} = s'(V \otimes V)s = \mathrm{tr}(VSVS)$ .              (47)

The superscript * will be used to indicate that the sweep operator
has been applied on a particular row of Q. An element $[q_{12}]_{i^*}$
of $q_{12}$ lies in row $i^*$ of Q on which the sweep operator has been
applied. Applying the reverse sweep operator on the same row of Q
cancels the sweep operation so that $[q_{12}]_{i^*}$ becomes
$[q_{12}]_i$.

The minimization algorithm is:

1./ Sweep Q on row i if $[q_{12}]_i > 0$.

2./ If 1./ results in a $[q_{12}]_{j^*}$, in a row $j^* \ne i^*$ on
    which Q has previously been swept, becoming negative, reverse sweep
    Q on row $j^*$.

3./ Continue until all $[q_{12}]_{i^*} \ge 0$ and all
    $[q_{12}]_i \le 0$.

The sweep operator is never applied on the last row of Q.
Then $\hat\gamma$ is given by

    $\hat\gamma_i = [q_{12}]_{i^*}$ , if Q has been swept on row $i = i^*$
                  = 0 , if Q has not been swept on row i

and $nf(\hat\gamma|V)$ may be obtained from

    $nf(\hat\gamma|V) = \tfrac{1}{2}n\,q_{22}$ .

Since

    $[\partial f(\gamma|V)/\partial\gamma_i]_{\gamma=\hat\gamma} = -[q_{12}]_i$ , if Q has not been swept on row i
                                                                 = 0 , if Q has been swept on row $i = i^*$

the Kuhn-Tucker conditions are satisfied,

    $\hat\gamma_i \ge 0$ ,
    $[\partial f(\gamma|V)/\partial\gamma_i]_{\gamma=\hat\gamma} \ge 0$ ,
    $\hat\gamma_i\,[\partial f(\gamma|V)/\partial\gamma_i]_{\gamma=\hat\gamma} = 0$ ,

and $\hat\gamma$ is a global minimum of $f(\gamma|V)$ subject to the
inequality constraints (46) (Fiacco & McCormick, 1968, pp. 89-90).

The sweep operator may then be applied on the remaining rows of
$Q_{11}$ (where $[q_{12}]_i < 0$) to obtain $\{\Theta(V)\}^{-1}$.
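A rough sketch of this sweep-based procedure follows. It is an illustration under stated assumptions rather than the algorithm of the cited references: the sign convention of `sweep` is one common variant (it leaves the swept matrix asymmetric, and applying it twice on the same row restores the original, so the same function serves as the reverse sweep), and the function names and the iteration cap are hypothetical.

```python
import numpy as np

def sweep(Q, k):
    """Sweep Q on row k (assumed convention); sweeping twice undoes the sweep."""
    Q = np.array(Q, dtype=float)
    d = Q[k, k]
    row = Q[k, :] / d
    col = Q[:, k].copy()
    Q -= np.outer(col, row)      # q_ij - q_ik q_kj / d for i, j != k
    Q[k, :] = row                # pivot row:    q_kj / d
    Q[:, k] = -col / d           # pivot column: -q_ik / d
    Q[k, k] = 1.0 / d
    return Q

def nonnegative_by_sweeping(Q11, q12, q22, max_passes=100):
    """Sweep rows with positive q12; reverse sweep rows whose q12 turns negative."""
    q = len(q12)
    Q = np.block([[Q11, q12[:, None]], [q12[None, :], np.array([[q22]])]])
    swept = [False] * q
    for _ in range(max_passes):
        changed = False
        for i in range(q):
            if not swept[i] and Q[i, q] > 0:
                Q, swept[i], changed = sweep(Q, i), True, True
                negative = [j for j in range(q) if swept[j] and Q[j, q] < 0]
                while negative:                       # step 2 of the algorithm
                    Q, swept[negative[0]] = sweep(Q, negative[0]), False
                    negative = [j for j in range(q) if swept[j] and Q[j, q] < 0]
        if not changed:
            break
    gamma = np.where(swept, Q[:q, q], 0.0)
    return gamma, Q[q, q]

Q11 = np.array([[2.0, 1.0], [1.0, 2.0]])

# Unconstrained solution positive: sweeping every row solves Q11 gamma = q12.
gamma_free, resid_free = nonnegative_by_sweeping(Q11, np.array([3.0, 3.0]), 10.0)

# Second element would be negative if unconstrained, so it is held at zero.
gamma_con, resid_con = nonnegative_by_sweeping(Q11, np.array([1.0, -1.0]), 3.0)
print(gamma_free, resid_free, gamma_con, resid_con)
```

In the second example only the first row is swept, giving $\hat\gamma = (0.5, 0)$ and a corner element of 2.5; the unswept entry of $q_{12}$ is then negative, so the gradient condition holds for the parameter fixed at zero.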
In some cases some elements of $\gamma_o$ may be in known ratio. For
example, suppose that

    $D_{\psi o} = \psi_o D_\alpha$

where $D_\alpha$ is a known diagonal matrix (e.g., $D_\alpha = I$). Then
$\gamma' = (\phi', \psi)$, q = m + 1, and estimates are obtained as
before with $Q_{11}$ and $q_{12}$ adapted to the reduced parameter
vector and $q_{22}$ defined by (47).

Similar procedures may be employed when other elements of $\gamma_o$ are
equal or in known ratio.

Case II. $\Phi$ is symmetric.

In (41) we now have

    $\Delta = \{(\Lambda \otimes \Lambda)K_m^{-\prime}, H_p\}$ ,

    q = {m(m + 1)/2} + p .

After some algebra, making use of the methods of Section 2, (42) can
be simplified to:

    $\hat\Phi = B'(S - D_{\hat\psi})B$                             (48)
    $\hat\psi = W\,\mathrm{diag}[VSV - GSG]$                       (49)

where

    $B = V\Lambda(\Lambda'V\Lambda)^{-1}$
    $G = V\Lambda(\Lambda'V\Lambda)^{-1}\Lambda'V$
    $W = (V*V - G*G)^{-1}$ .

The matrix to be inverted to give W is a submatrix of
$(V + G) \otimes (V - G)$ and is therefore positive semidefinite
provided that V is positive definite. Singularity of the matrix implies
that $\gamma_o$ is not identified.

It is of interest to note that, although the number of parameters to
be estimated in Case II is greater than that in Case I, the largest
matrix to be inverted in (48), (49) is of order p while the inversion of
a matrix of order (p + m) is required when (44), (45), (42) are employed.

Taking $Q_{11} = (V*V - G*G)$, $q_{12} = \mathrm{diag}[VSV - GSG]$, and
$q_{22} = \mathrm{tr}\{VSVS - GSGS\}$ and replacing $\gamma$ by $\psi$,
the algorithm described under Case I may be employed to give a
$\hat\psi$ satisfying the inequality constraints

    $\psi_i \ge 0$ ,   i = 1 ... p .                               (50)

When $\hat\psi$ has been obtained, $\hat\Phi$ may be obtained from (48).
This gives the absolute minimum of $f(\gamma|V)$ subject to the
inequality constraints (50). It is possible that $\hat\Phi$, an estimated
dispersion matrix, will not be positive semidefinite. To ensure that
$\hat\Phi$ is positive semidefinite one could replace $\Phi$ by TT', but
the model would then no longer be linear and the estimates would be more
difficult to obtain.
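The simplified expressions (48) and (49) can be compared with the general closed form (42). The sketch below is illustrative only: the 5 x 2 model matrix, the random data and the helper names (`duplication` for $K_m^{-\prime}$, `vech_upper`, `H_matrix`) are assumptions, and no inequality constraints are imposed.

```python
import numpy as np

def vec(S):
    """Stack the columns of S (column-major order)."""
    return S.reshape(-1, order="F")

def vech_upper(S):
    """Elements on and above the diagonal, columnwise: s11, s12, s22, ..."""
    m = S.shape[0]
    return np.array([S[g, h] for h in range(m) for g in range(h + 1)])

def duplication(m):
    """Transpose of K_m^-: maps the distinct elements of a symmetric matrix to its Vec."""
    pairs = [(g, h) for h in range(m) for g in range(h + 1)]
    D = np.zeros((m * m, len(pairs)))
    for col, (g, h) in enumerate(pairs):
        D[h * m + g, col] = 1.0
        D[g * m + h, col] = 1.0
    return D

def H_matrix(p):
    """H_p (p^2 x p): selects the diagonal elements from Vec of a p x p matrix."""
    H = np.zeros((p * p, p))
    for g in range(p):
        H[g * p + g, g] = 1.0
    return H

rng = np.random.default_rng(4)
Lam = np.column_stack([np.ones(5), np.arange(5.0)])   # assumed model matrix
p, m = Lam.shape
X = rng.standard_normal((40, p))
S = X.T @ X / 40
V = np.linalg.inv(S)

# General closed form (42) with Delta = {(Lam (x) Lam) K_m^-', H_p}.
Delta = np.hstack([np.kron(Lam, Lam) @ duplication(m), H_matrix(p)])
Theta = Delta.T @ np.kron(V, V) @ Delta
gamma_general = np.linalg.solve(Theta, Delta.T @ vec(V @ S @ V))

# Simplified forms (48) and (49): nothing larger than p x p is inverted.
B = V @ Lam @ np.linalg.inv(Lam.T @ V @ Lam)
G = B @ Lam.T @ V
psi_hat = np.linalg.solve(V * V - G * G, np.diag(V @ S @ V - G @ S @ G))
Phi_hat = B.T @ (S - np.diag(psi_hat)) @ B
gamma_simplified = np.concatenate([vech_upper(Phi_hat), psi_hat])
print(np.allclose(gamma_general, gamma_simplified, atol=1e-7))
```

Solving for $\hat\psi$ first and then substituting into (48) mirrors the order used in the text, where the constrained algorithm of Case I is applied to $\psi$ alone.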

If V is a consistent estimator of $\Sigma_o^{-1}$, we have

    $\mathrm{Cov}(\hat\gamma, \hat\gamma') = 2n^{-1}\{\Theta(V)\}^{-1}$ ,

with elements

    $\mathrm{Cov}(\hat\psi_i, \hat\psi_j) = 2n^{-1}w_{ij}$ ,

    $\mathrm{Cov}(\hat\phi_{ij}, \hat\phi_{gh}) = n^{-1}\big(c_{ig}c_{jh} + c_{ih}c_{jg} + 2\sum_r\sum_s b_{ri}b_{rj}w_{rs}b_{sg}b_{sh}\big)$ .

The case where the elements of $\psi_o$ are in known ratio,

    $D_{\psi o} = \psi_o D_\alpha$ ,

may be treated as in Case I. Taking

    $w = \{\alpha'(V*V - G*G)\alpha\}^{-1}$

we have:

    q = {m(m + 1)/2} + 1 ,
    $\hat\psi = w\alpha'\,\mathrm{diag}(VSV - GSG)$ ,
    $\hat\Phi = B'(S - \hat\psi D_\alpha)B$ ,

    $\mathrm{Cov}(\hat\phi_{ij}, \hat\phi_{gh}) = n^{-1}\big(c_{ig}c_{jh} + c_{ih}c_{jg} + 2w[B'D_\alpha B]_{ij}[B'D_\alpha B]_{gh}\big)$ .

Formulae, both in Case I and Case II, simplify in an obvious manner when
$V = S^{-1}$. When maximum likelihood estimates are being obtained and
$V = (\Lambda\hat\Phi\Lambda' + D_{\hat\psi})^{-1}$, the following
well-known identities may be employed to reduce computation if
$|D_{\hat\psi}| \ne 0$:

    $V = D_{\hat\psi}^{-1} - D_{\hat\psi}^{-1}\Lambda(\hat\Phi^{-1} + \Lambda'D_{\hat\psi}^{-1}\Lambda)^{-1}\Lambda'D_{\hat\psi}^{-1}$ ,

    $(\Lambda'V\Lambda)^{-1}\Lambda'V = (\Lambda'D_{\hat\psi}^{-1}\Lambda)^{-1}\Lambda'D_{\hat\psi}^{-1}$ .

We note, also, that Proposition 8 applies in both Case I and Case II.
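The two identities can be checked numerically. This is a small sketch with arbitrary illustrative values for $\Lambda$, $\hat\Phi$ and $D_{\hat\psi}$ (all assumptions made here), written with the diagonal matrix stored as a vector so that its inverse is elementwise.

```python
import numpy as np

Lam = np.column_stack([np.ones(5), np.arange(5.0)])   # assumed 5 x 2 model matrix
Phi = np.array([[2.0, 0.5], [0.5, 1.0]])              # assumed positive definite Phi
psi = np.array([1.0, 2.0, 3.0, 4.0, 5.0])             # assumed diagonal of D_psi
D_inv = np.diag(1.0 / psi)

# Direct inversion of the p x p matrix.
V_direct = np.linalg.inv(Lam @ Phi @ Lam.T + np.diag(psi))

# First identity: only an m x m matrix has to be inverted.
inner = np.linalg.inv(np.linalg.inv(Phi) + Lam.T @ D_inv @ Lam)
V_identity = D_inv - D_inv @ Lam @ inner @ Lam.T @ D_inv

# Second identity: (Lam' V Lam)^{-1} Lam' V does not depend on Phi.
lhs = np.linalg.solve(Lam.T @ V_direct @ Lam, Lam.T @ V_direct)
rhs = np.linalg.solve(Lam.T @ D_inv @ Lam, Lam.T @ D_inv)
print(np.allclose(V_direct, V_identity), np.allclose(lhs, rhs))
```

Since $B' = (\Lambda'V\Lambda)^{-1}\Lambda'V$, the second identity means that B in (48) can be formed from $D_{\hat\psi}^{-1}$ alone when V is the inverse of the fitted covariance matrix.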

The Fisher scoring algorithm employed here for obtaining M.W.L. 
estimates may require more iterations to attain convergence than existing 
Newton algorithms, but less computation is required during each iteration. 
This reduction in computation per iteration is particularly noticeable in 
Case II. 

The B.G.L.S. estimates obtained using $S^{-1}$ for V require less
computation than the M.W.L. estimates and have the same desirable
asymptotic properties. Small sample properties of the estimators are as
yet unknown. In a Monte Carlo experiment (Durand, 1971) use of $S^{-1}$
for V gave estimates which appeared more biased
($E(\hat\gamma) < \gamma_o$) than M.W.L. estimates but which, however,
appeared to be as precise in terms of mean squared error of estimation.
Also, in practical applications of both Case I and Case II procedures,
the author has observed that taking $V = S^{-1}$ tends to give estimates
which are slightly smaller than the M.W.L. estimates. A similar tendency
in factor analysis was noted by Jöreskog & Goldberger (1972).

This tendency is apparent in the example given in Table 1a. It
shows G.L.S. estimates ($V = I$, $V = S^{-1}$) and M.W.L.
($V = \hat\Sigma^{-1}$) estimates of parameters in a quasi-simplex model
based on a covariance matrix obtained by Bilodeau (1957) in a study of a
two-hand coordination task. This matrix has been reported and analyzed by
Bock & Bargmann (1966) and by Jöreskog (1970b). The model is:

    $\Sigma = \Lambda D_\phi \Lambda' + \psi I$

where

    $\lambda_{ij}$ = 1 ,   $p \ge i \ge j \ge 1$
                   = 0 ,   i < j .

It can be seen that the G.L.S. estimates with $V = S^{-1}$ and the
M.W.L. estimates ($V = \hat\Sigma^{-1}$) agree rather closely and differ
somewhat from the unweighted least squares estimates ($V = I$).

The successive G.L.S. (Fisher scoring) algorithm for obtaining M.W.L.
estimates converged to four figures on the third iteration. Estimates of
standard errors and values of the test statistics are given in Tables 1b
and 1c.

Table 1. Bilodeau's Example.

a) Estimates of parameters in a quasi-simplex model.

    V                     $\hat\phi_1$  $\hat\phi_2$  $\hat\phi_3$  $\hat\phi_4$  $\hat\phi_5$  $\hat\phi_6$  $\hat\psi$
    $I$                      504.1         65.5          51.1         124.6          56.7          22.7         19.5
    $S^{-1}$                 452.3         55.4          15.4          74.4          20.6           0.0         44.5
    $\hat\Sigma^{-1}$        482.6         54.6          15.9          81.4          21.6           1.5         45.5

b) Estimates of standard errors, $\mathrm{diag}^{1/2}[2n^{-1}\{\Theta(V)\}^{-1}]$.

    V                     $\hat\phi_1$  $\hat\phi_2$  $\hat\phi_3$  $\hat\phi_4$  $\hat\phi_5$  $\hat\phi_6$  $\hat\psi$
    $S^{-1}$                  56.9         14.6          10.2          14.5           9.5          10.1          4.8
    $\hat\Sigma^{-1}$         58.7         14.6          10.2          14.9           9.6          10.2          4.7

c) Test statistics. d.f. = 14, n = 152.

    V                     $nf(\hat\gamma|V)$   $nF(\hat\gamma)$
    $S^{-1}$                    9.54
    $\hat\Sigma^{-1}$           9.24                9.46